[PRIM][IR]Complete IR vjp code gen for more vjp code gen #56798
Conversation
Your PR was submitted successfully. Thank you for your contribution to the open source project!
d5f618b to 8bb28a3 (Compare): …e/Paddle into support_mutable_attributes
def is_mutable_attribute(attr):
    return (
        attr['typename'] in ['Scalar', 'IntArray']
        and attr['support_tensor'] is True
    )
How are operators handled whose attributes are of these two types, 'Scalar' or 'IntArray', but do not have the support_tensor property?
They are handled as constants.
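A minimal Python sketch of that answer, built around a hypothetical classify_attributes helper: Scalar/IntArray attributes without support_tensor fall into the constant branch together with every other attribute type.

def classify_attributes(attrs):
    # Hypothetical helper: split an op's attributes into those that become
    # inputs under the new IR (mutable) and those that stay compile-time
    # constants (everything else, including Scalar/IntArray attributes
    # without support_tensor).
    mutable, constant = [], []
    for attr in attrs:
        if (
            attr['typename'] in ['Scalar', 'IntArray']
            and attr.get('support_tensor') is True
        ):
            mutable.append(attr['name'])
        else:
            constant.append(attr['name'])
    return mutable, constant

attrs = [
    {'name': 'axis', 'typename': 'IntArray', 'support_tensor': True},
    {'name': 'value', 'typename': 'Scalar', 'support_tensor': False},
]
print(classify_attributes(attrs))  # (['axis'], ['value'])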
8cc333a to 033acf4 (Compare): … support_mutable_attributes
Please use a meaningful PR description, for example: this PR extends the codegen logic to support the XX high-priority operators that GPT/LLama depend on.
);
{% elif outputs|length == 1 %}
return Tensor(std::make_shared<LazyTensor>(op_res));
return std::make_tuple({% for i in range(outputs|length) %}{{outputs[i].name}}{%- if i!=outputs|length - 1 -%}, {% endif %}{% endfor %});
Please split this into several macro functions. As a rule of thumb, a single function should be kept to roughly 50 lines, and a module should fit on one screen.
This will be cleaned up uniformly in the next PR; for now let's merge a version so we do not block others' work.
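For reference, a small standalone sketch of how the multi-output return line in the template above renders. The outputs list below is made up for illustration; the template string reuses the same Jinja2 expression as the snippet under review.

from jinja2 import Template

# Render only the tuple-return line from the template above, with a made-up
# two-output op, to show the C++ statement this branch would emit.
line = (
    "return std::make_tuple("
    "{% for i in range(outputs|length) %}{{outputs[i].name}}"
    "{%- if i != outputs|length - 1 -%}, {% endif %}{% endfor %});"
)
outputs = [{"name": "x_grad"}, {"name": "y_grad"}]
print(Template(line).render(outputs=outputs))
# -> return std::make_tuple(x_grad, y_grad);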
{% endif %}
{% endif %}
{% endfor %}
{% endmacro %}
Compiler constant inference (constant folding) is a fairly common technique; it could be wrapped in a separate, readable function to make the purpose of this part of the code clear.
+1
This will be cleaned up uniformly in the next PR.
LGTM
LGTM
{% endif %}
{% endif %}
{% endfor %}
{% endmacro %}
+1
LGTM for tests_utils.py
for attr in attrs:
    if (
        attr['typename'] in ['Scalar', 'IntArray']
        and attr['support_tensor'] is True
1. The typename of a Scalar attribute is not always just Scalar; it can also be Scalar(int), Scalar(int64_t), and so on. This function could use the helpers from tests_utils.py for the check:

def is_scalar(s):
    return re.match(r"Scalar(\(\w+\))*", s) is not None

def is_intarray(s):
    return s == 'IntArray'

2. Under the new IR, do mutable attributes also need to be checked with:
attr['tensor_name'] is not None or attr['tensors_name'] is not None
?
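For illustration only, a small self-contained sketch of how is_mutable_attribute could be rewritten on top of those two predicates; this rewrite is hypothetical, not the code that was merged.

import re

def is_scalar(s):
    # Matches bare 'Scalar' as well as parameterized forms like 'Scalar(int)'.
    return re.match(r"Scalar(\(\w+\))*", s) is not None

def is_intarray(s):
    return s == 'IntArray'

def is_mutable_attribute(attr):
    # Hypothetical rewrite of the helper under discussion, using the
    # tests_utils.py predicates so parameterized Scalar typenames also match.
    return (
        (is_scalar(attr['typename']) or is_intarray(attr['typename']))
        and attr.get('support_tensor') is True
    )

print(is_mutable_attribute({'typename': 'Scalar(int64_t)', 'support_tensor': True}))  # True
print(is_mutable_attribute({'typename': 'Scalar', 'support_tensor': False}))          # False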
Thanks for the reminder. On the first point, my understanding is that since the data type is already explicit, no change is needed; the second point is already handled in gen.py.
…e#56798)
* Fix attr type error like concat axis
* Fix None input error
* Fix intermediate output
* support vjp code gen
Co-authored-by: 0x45f <wangzhen45@baidu.com>
PR types: New features
PR changes: Others
Description:
Pcard-66975
Background:
Building on PR #56512, this PR scales up vjp generation for backward operators, prioritizing the vjp of the 26 high-priority operators that GPT/LLama depend on.

Changes in this PR:
1. Mutable attributes. Under the new IR, a mutable attribute becomes an input. However, when composite operators build the network and need to branch on the attribute's value, the attribute must be a constant. The vjp interface therefore uniformly takes mutable attributes as inputs and converts internally depending on composite vs. non-composite mode: in composite mode, the real attribute value is recovered from the input; in non-composite mode, the input is passed through directly. Taking the sum operator as an example, the generated vjp code follows this pattern, and the sum network-building API generates two forms at the differentiation layer, one with the mutable attribute as an input and one with it as a plain attribute (see the sketch after this description).
2. intermediate outputs are not returned by the network-building API. Taking reshape as an example, xshape, being an intermediate output, will not be returned.
3. Multiple inputs and multiple outputs.

Work remaining:
1. Full vjp generation for the high-priority operators that GPT/LLama depend on.
2. optional support, and a representation for empty-gradient semantics in backward.
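A minimal Python sketch of the composite vs. non-composite dispatch described in point 1, using the sum example. Every name below (get_constant_value, sum_grad_kernel, sum_grad_composite, sum_vjp) is hypothetical; the sketch mirrors only the control flow, not the actual generated C++.

def get_constant_value(axis_input):
    # Stand-in for "recover the real attribute value from the input" in
    # composite mode (in the real code this would read a constant-producing op).
    return axis_input["constant_value"]

def sum_grad_kernel(x, out_grad, axis_input):
    return f"call sum_grad kernel with mutable-attribute input {axis_input['name']}"

def sum_grad_composite(x, out_grad, axis_value):
    return f"expand the composite rule with constant axis {axis_value}"

def sum_vjp(x, out_grad, axis_input, use_composite):
    if use_composite:
        # Composite rules must branch on a concrete constant, so the value
        # is recovered from the input.
        return sum_grad_composite(x, out_grad, get_constant_value(axis_input))
    # Non-composite mode forwards the mutable-attribute input unchanged.
    return sum_grad_kernel(x, out_grad, axis_input)

axis = {"name": "axis", "constant_value": [0]}
print(sum_vjp("x", "out_grad", axis, use_composite=True))
print(sum_vjp("x", "out_grad", axis, use_composite=False))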